 event-triggered mechanism


Appendices A Sketch of Theoretical Analyses

Neural Information Processing Systems

Theorem B.1 (Performance difference bound for Model-based RL). Mi denote the inconsistency between the learned dynamics PMi and the true dynamics, i.e. ϵ … For L1 L3, with the performance gap approximation of M1 and π1, we apply Lemma C.2, and … Here, dπMi denotes the distribution of state-action pairs induced by policy π under the dynamical model Mi. Theorem B.3 (Refined bound with constraints). Let µ and v be two probability distributions on the configuration space X, according to Lemma C.1, then we have DTV(µ … Under these definitions, we can yield the following intermediate outcome by applying the results from B.2 and B.1 … Here, we take the time-varying linear quadratic regulator as an instance for illustrating the rationality of our assumption on α.




When to Update Your Model: Constrained Model-based Reinforcement Learning

Tianying Ji, Yu Luo

Neural Information Processing Systems

Designing and analyzing model-based RL (MBRL) algorithms with guaranteed monotonic improvement has been challenging, mainly due to the interdependence between policy optimization and model learning. Existing discrepancy bounds generally ignore the impacts of model shifts, and their corresponding algorithms are prone to degrading performance through drastic model updates. In this work, we first propose a novel and general theoretical scheme for a non-decreasing performance guarantee of MBRL. Our follow-up derived bounds reveal the relationship between model shifts and performance improvement. These discoveries encourage us to formulate a constrained lower-bound optimization problem to permit the monotonicity of MBRL. A further example demonstrates that learning models from a dynamically varying number of explorations benefits the eventual returns. Motivated by these analyses, we design a simple but effective algorithm CMLO (Constrained Model-shift Lower-bound Optimization), by introducing an event-triggered mechanism that flexibly determines when to update the model. Experiments show that CMLO surpasses other state-of-the-art methods and produces a boost when various policy optimization methods are employed.


Byzantine-Resilient Output Optimization of Multiagent via Self-Triggered Hybrid Detection Approach

Yan, Chenhang, Yan, Liping, Lv, Yuezu, Dong, Bolei, Xia, Yuanqing

arXiv.org Artificial Intelligence

Achieving precise distributed optimization despite unknown attacks, especially Byzantine attacks, is one of the critical challenges for multiagent systems. This paper addresses distributed resilient optimization for linear heterogeneous multi-agent systems facing adversarial threats. We establish a framework aimed at realizing resilient optimization for continuous-time systems by incorporating a novel self-triggered hybrid detection approach. The proposed hybrid detection approach identifies attacks on neighbors using both error thresholds and triggering intervals, thereby balancing effective attack detection against the reduction of excessive communication triggers. Using an edge-based adaptive self-triggered approach, each agent can receive its neighbors' information and determine whether this information is valid. If any neighbor proves invalid, each normal agent isolates that neighbor by disconnecting communication along that specific edge. Importantly, our adaptive algorithm guarantees the accuracy of the optimization solution even when an agent is isolated by its neighbors.
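The abstract does not spell out the detection rule, so the following Python sketch only illustrates the general idea of combining an error threshold with a triggering-interval check and then cutting the offending edge. Everything here is an assumption made for illustration: the function names `find_invalid_neighbors` and `isolate`, the `(time, received, expected)` report format, and the rule that a neighbor is suspicious when it triggers sooner than `min_interval` after its previous trigger. It is not the authors' algorithm.

```python
def find_invalid_neighbors(reports, error_threshold, min_interval):
    """Flag neighbors whose reports look invalid (hypothetical rule).

    reports maps a neighbor id to a time-ordered list of
    (time, received_value, expected_value) tuples.
    A neighbor is flagged if any received value deviates from the expected
    value by more than error_threshold, or if two consecutive triggers are
    closer together than min_interval.
    """
    invalid = set()
    for neighbor, events in reports.items():
        previous_time = None
        for time, received, expected in events:
            too_far = abs(received - expected) > error_threshold
            too_soon = (
                previous_time is not None
                and (time - previous_time) < min_interval
            )
            if too_far or too_soon:
                invalid.add(neighbor)
                break
            previous_time = time
    return invalid


def isolate(adjacency, agent, invalid):
    """Return the agent's neighbor set with invalid neighbors removed,
    i.e. communication along those specific edges is disconnected."""
    return {nbr for nbr in adjacency[agent] if nbr not in invalid}


reports = {
    "a": [(0.0, 1.0, 1.0), (1.0, 1.1, 1.0)],   # small error, normal spacing
    "b": [(0.0, 5.0, 1.0)],                    # large error
    "c": [(0.0, 1.0, 1.0), (0.1, 1.0, 1.0)],   # triggers too close together
}
invalid = find_invalid_neighbors(reports, error_threshold=0.5, min_interval=0.5)
remaining = isolate({"me": {"a", "b", "c"}}, "me", invalid)
```

In this toy run, neighbors "b" and "c" are flagged and only "a" stays connected. Using two independent checks means a neighbor that keeps its error small but floods the channel with triggers is still caught.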


When to Update Your Model: Constrained Model-based Reinforcement Learning

Ji, Tianying, Luo, Yu, Sun, Fuchun, Jing, Mingxuan, He, Fengxiang, Huang, Wenbing

arXiv.org Artificial Intelligence

Designing and analyzing model-based RL (MBRL) algorithms with guaranteed monotonic improvement has been challenging, mainly due to the interdependence between policy optimization and model learning. Existing discrepancy bounds generally ignore the impacts of model shifts, and their corresponding algorithms are prone to degrading performance through drastic model updates. In this work, we first propose a novel and general theoretical scheme for a non-decreasing performance guarantee of MBRL. Our follow-up derived bounds reveal the relationship between model shifts and performance improvement. These discoveries encourage us to formulate a constrained lower-bound optimization problem to permit the monotonicity of MBRL. A further example demonstrates that learning models from a dynamically varying number of explorations benefits the eventual returns. Motivated by these analyses, we design a simple but effective algorithm CMLO (Constrained Model-shift Lower-bound Optimization), by introducing an event-triggered mechanism that flexibly determines when to update the model. Experiments show that CMLO surpasses other state-of-the-art methods and produces a boost when various policy optimization methods are employed.
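As a rough illustration of what an event-triggered update rule can look like, the Python sketch below fires a model update whenever an accumulated shift estimate crosses a threshold, so the number of exploration steps between updates varies with the data. It is not the CMLO algorithm: the function name `event_triggered_updates`, the per-step `shift_estimates` input, the accumulation rule, and the `threshold` parameter are all assumptions made for this example.

```python
def event_triggered_updates(shift_estimates, threshold):
    """Return the step indices at which a model update would be triggered.

    shift_estimates[t] is a hypothetical, non-negative estimate of how much
    the data collected at step t would shift the learned model. The estimates
    are accumulated since the last update, and an update is triggered (and
    the accumulator reset) once the total reaches the threshold.
    """
    update_steps = []
    accumulated = 0.0
    for step, shift in enumerate(shift_estimates):
        accumulated += shift
        if accumulated >= threshold:
            update_steps.append(step)
            accumulated = 0.0  # the model is retrained, so start counting again
    return update_steps


# Novel data early on (large shifts) triggers updates sooner than
# familiar data later on (small shifts).
steps = event_triggered_updates([0.1, 0.2, 0.3, 0.1, 0.5], threshold=0.5)
```

With these inputs the rule triggers at steps 2 and 4. A fixed-interval schedule would instead update every k steps regardless of how informative the new data is.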


Event-triggered privacy preserving consensus control with edge-based additive noise

Liang, Limei, Ding, Ruiqi, Liu, Shuai

arXiv.org Artificial Intelligence

In this article, we investigate the distributed privacy-preserving weighted consensus control problem for linear continuous-time multi-agent systems under the event-triggered communication mode. A novel event-triggered privacy-preserving consensus scheme is proposed, which can be divided into three phases. First, for each agent, an event-triggered mechanism is designed to determine whether the current state is transmitted to the corresponding neighbor agents, which avoids frequent real-time communication. Then, to protect the privacy of initial states from disclosure, edge-based mutually independent standard white noise is added to each communication channel. Further, to attenuate the effect of noise on consensus control, we propose a stochastic-approximation-type protocol for each agent. Using tools from stochastic analysis and graph theory, the asymptotic property and convergence accuracy of the consensus error are analyzed. Finally, a numerical simulation is given to illustrate the effectiveness of the proposed scheme.
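The three phases described above can be sketched in a few lines of Python. This is a simplified discrete-time, scalar-state, unweighted toy and not the authors' continuous-time protocol: the function name `simulate_consensus`, the fixed trigger tolerance `trigger_tol`, the Gaussian noise level `noise_std`, and the decaying gain `1 / (k + 1)` are assumptions chosen for illustration.

```python
import random


def simulate_consensus(x0, edges, steps=2000, trigger_tol=0.05,
                       noise_std=0.1, seed=0):
    """Toy event-triggered consensus with per-edge noise (illustrative only).

    x0    : list of initial scalar states, one per agent
    edges : list of undirected edges (i, j)
    Returns the final states and the number of triggered transmissions.
    """
    rng = random.Random(seed)
    n = len(x0)
    x = list(x0)
    last_sent = list(x0)  # most recently transmitted state of each agent
    transmissions = 0
    for k in range(1, steps + 1):
        # Phase 1: an agent transmits only when its state has drifted
        # from the last transmitted value by more than the tolerance.
        for i in range(n):
            if abs(x[i] - last_sent[i]) > trigger_tol:
                last_sent[i] = x[i]
                transmissions += 1
        # Phase 3: decaying gain, in the spirit of stochastic approximation,
        # so that the injected noise is averaged out over time.
        gain = 1.0 / (k + 1)
        delta = [0.0] * n
        for i, j in edges:
            # Phase 2: independent noise on each direction of each edge.
            received_by_i = last_sent[j] + rng.gauss(0.0, noise_std)
            received_by_j = last_sent[i] + rng.gauss(0.0, noise_std)
            delta[i] += received_by_i - x[i]
            delta[j] += received_by_j - x[j]
        x = [x[i] + gain * delta[i] for i in range(n)]
    return x, transmissions


path = [(0, 1), (1, 2)]
noisy_states, sent = simulate_consensus([0.0, 5.0, 10.0], path)
clean_states, _ = simulate_consensus([0.0, 5.0, 10.0], path,
                                     trigger_tol=0.0, noise_std=0.0)
```

Setting both `trigger_tol` and `noise_std` to zero reduces the sketch to plain averaging with a decaying gain, which is a convenient sanity check: the states then approach the initial average of 5. With the defaults, `sent` stays below the `steps * n` messages that transmitting at every step would require.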